Identifying translationese at the word and sub-word level

نویسندگان

  • Ehud Alexander Avner
  • Noam Ordan
  • Shuly Wintner
چکیده

We use text classification to distinguish automatically between original and translated texts in Hebrew, a morphologically complex language. To this end, we design several linguistically informed feature sets that capture word-level and sub-word-level (in particular, morphological) properties of Hebrew. Such features are abstract enough to allow for the development of accurate, robust classifiers, and they also lend themselves to linguistic interpretation. Careful evaluation shows that some of the classifiers we define are, indeed, highly accurate, and scale up nicely to domains that they were not trained on. In addition, analysis of the best features provides insight into the morphological properties of translated texts.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Do We Need Discipline-Specific Academic Word Lists? Linguistics Academic Word List (LAWL)

This corpus-based study aimed at exploring the most frequently-used academic words in linguistics and compare the wordlist with the distribution of high frequency words in Coxhead’s Academic Word List (AWL) and West’s General Service List (GSL) to examine their coverage within the linguistics corpus. To this end, a corpus of 700 linguistics research articles (LRAC), consisting of approximately ...

متن کامل

Studying Translationese at the Character Level

This paper presents a set of preliminary experiments which show that identifying translationese is possible with machine learning methods that work at character level, more precisely methods that use string kernels. But caution is necessary because string kernels very easily can introduce confounding factors.

متن کامل

Evaluating the Success of the Visual Learners in Vocabulary Learning through Word List versus Sentence Making Approaches.

Thisstudy sought to evaluate the learners'''' achievements with the visual learning style when exposed to the sentence making and word list approaches. On that account, 45 basic level participants who studied at the Iran Language Institute (ILI), Bushehr, took part in this research study. At the outset, the learners were given Barsch learning style inventory (1991) to determine the learners''''...

متن کامل

The Effect of Mnemonic Key Word Method on Vocabulary Learning and Long Term Retention

Most of the studies on the key word method of second/foreign language vocabulary learning have been based on the evidence from laboratory experiments and have primarily involved the use of English key words to learn the vocabularies of other languages. Furthermore, comparatively quite limited number of such studies is done in authentic classroom contexts. The present study inquired into the eff...

متن کامل

Evaluating the Success of the Visual Learners in Vocabulary Learning through Word List versus Sentence Making Approaches

Thisstudy sought to evaluate the learners' achievements with the visual learning style when exposed to the sentence making and word list approaches. On that account, 45 basic level participants who studied at the Iran Language Institute (ILI), Bushehr, took part in this research study. At the outset, the learners were given Barsch learning style inventory (1991) to determine the learners' learn...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • DSH

دوره 31  شماره 

صفحات  -

تاریخ انتشار 2016